WO 96/35777 PCT/US96/06569

CHIMERIC ENZYME FOR PROMOTING TARGETED
INTEGRATION OF FOREIGN DNA INTO A HOST GENOME



Pursuant to 35 U.S.C. §202(c), it is hereby
acknowledged that the U.S. Government has certain
rights in the invention described herein, which was
made in part with funds from the National Institutes of
Health.

FIELD OF THE INVENTION 

The present invention relates to genetic
modification of eucaryotic organisms. In particular,
this invention provides novel fusion proteins
comprising a target-specific DNA binding moiety and a
retroviral integrase moiety. The enzyme is useful for
facilitating target-specific integration of foreign DNA
into a host genome.

BACKGROUND OF THE INVENTION 

The practical utility of genetic engineering 
often depends on introducing inheritable genetic traits 
into organisms. To achieve this goal, foreign DNA must 
be stably integrated into the DNA of the host organism. 
Stable integration of foreign DNA into host DNA is 
often referred to as "transformation" of the host cell 
(or genome of the cell).

Genetic transformation in higher eucaryotes
is often accomplished through the use of viral vectors
which rely on stable integration in the host genome as
part of their replicative cycle. Retroviruses are one
of the few animal viruses that depend upon integration
for replication. A number of retroviral vector systems
are currently available to mediate transformation of
animal genomes. Such systems utilize one or more
vectors, at least one of which contains the portion of
the retroviral genome responsible for integration of
the viral genome into the host genome.

Integration of retroviral DNA requires a
virus-encoded enzyme, the integrase (IN), which is
encoded by the viral pol gene and carried within the
virus particle. (For a review of the retroviral
enzymes, including integrase, see Katz & Skalka, Ann.
Rev. Biochem. 63: 133-173, 1994). Integration also
requires cis-acting sequences at the ends of linear
viral DNA. Integration is site-specific with respect
to the viral DNA (it occurs at the linear ends), but
appears to be nearly random with respect to host DNA.

Biochemical and genetic experiments indicate
that integration takes place through two steps. First,
IN nicks the viral DNA two nucleotides from the 3' ends
of each DNA strand (referred to as the "processing"
reaction). This nicking exposes the highly conserved
CA dinucleotides, usually located two nucleotides from
the 3' end of each strand. The new 3'-OH ends of each
viral DNA strand are then joined to the host DNA in a
second reaction (referred to as the "joining"
reaction). The joining reaction is believed to proceed
by a direct attack mechanism whereby the 3'-OH ends of
viral DNA strands attack host DNA phosphates that are
staggered by 4-6 base pairs. The simplest model for IN
function is one in which a single monomer is bound to
each end of viral DNA and each monomer is capable of
binding viral DNA and host DNA simultaneously.
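
By way of illustration only, the two-step model described above can be
sketched in Python. The sketch is not part of the disclosure: the
function names (process_viral_end, joining_sites) and the example strand
are hypothetical, and the model simply assumes that the conserved CA
dinucleotide nearest the 3' end of the supplied strand marks the
processing site.

```python
def process_viral_end(strand_5to3: str) -> tuple[str, str]:
    """Model the "processing" reaction on one viral DNA strand.

    The strand is written 5'->3'. The nick is placed immediately 3' of
    the last CA dinucleotide (an assumption of this sketch), so the
    processed strand ends in ...CA with a new 3'-OH.
    """
    idx = strand_5to3.rfind("CA")
    if idx == -1:
        raise ValueError("no conserved CA dinucleotide found")
    processed = strand_5to3[: idx + 2]  # strand now terminates in CA
    released = strand_5to3[idx + 2:]    # nucleotides removed by the nick
    return processed, released


def joining_sites(position: int, stagger: int) -> tuple[int, int]:
    """Model the "joining" reaction: return the two host positions,
    one per strand, attacked by the two viral 3'-OH ends."""
    if not 4 <= stagger <= 6:
        raise ValueError("the text describes a stagger of 4-6 base pairs")
    return position, position + stagger


# Hypothetical viral end with CA located two nucleotides from the 3' end.
processed, released = process_viral_end("GGCTTCATT")
print(processed, released)    # GGCTTCA TT
print(joining_sites(100, 6))  # (100, 106)
```

In this toy model the released fragment is two nucleotides long whenever
CA sits two nucleotides from the 3' end, which corresponds to the
processing reaction described above.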

Both processing and joining activities can be
assayed in vitro using short synthetic DNA substrates
that mimic the single ends of retroviral DNA (see Katz
& Skalka, 1994, supra). Both reactions are thought to
be catalyzed by a single active site, due to the
chemical similarity of the two reactions and the
general inability to biochemically separate the two
activities by mutagenesis.

IN is the only viral gene product required
for integration of viral DNA into a host genome. For
this reason, IN may be used to advantage to facilitate
genetic transformation of eucaryotic cells. However,
its utility is limited due to its lack of sequence
specificity with respect to the host DNA. That is,
IN-catalyzed integration can occur essentially at random
in the genome, which could result in activation or
deactivation of host genes essential for cellular
function. Thus, it would be a significant advance in
the art of genetic transformation to develop retroviral
integrases capable of site-specifically catalyzing
integration of foreign DNA into a pre-determined
location in the host genome.

It is an object of the present invention to
provide modified retroviral integrases capable of
enhancing the integration reaction and catalyzing
integration of foreign DNA at a selected target
location in a host genome. It is further an object of
the present invention to provide retroviral vectors
that encode such modified integrases, and which also
contain the foreign DNA to be inserted into the host
genome.

SUMMARY OF THE INVENTION

The present invention provides novel chimeric
genes and fusion proteins to enhance stable integration
of foreign DNA into host DNA, and to promote
site-specific integration at a selected location in a target
DNA molecule. The compositions of the invention are
particularly useful for enhancing integration of
foreign DNA carried on retroviral vectors, which should
be of wide utility as a research tool to study the
organization of gene expression pathways, as well as
for gene-based diagnostic and therapeutic purposes.

According to one aspect of the present
invention, a chimeric enzyme is provided, which
comprises a DNA binding moiety and an integrase moiety
derivable from a retroelement. The enzyme is capable
of binding a DNA molecule having a characteristic
determinant recognized by the DNA binding moiety, and
the enzyme possesses at least one activity
characteristic of a retroelement integrase.
Characteristic retroelement integrase activities
include processing of retroelement DNA termini, nicking
within double-stranded DNA, and integrating a DNA
molecule having processed retroelement termini into
another DNA molecule. In a preferred embodiment, the
chimeric enzymes of the invention possess all three
activities and integrate the processed DNA molecule
into a site that neighbors the binding site recognized
by the DNA binding moiety.

In preferred embodiments of the invention,
the chimeric enzyme described above is constructed so
that the DNA binding moiety is fused to the integrase
moiety at either the amino- or carboxyl-terminus of the
integrase moiety. Carboxyl-terminal fusion proteins
are particularly preferred because their encoding DNAs
can be incorporated into replication-competent
retroviral vectors.

According to another aspect of the present
invention, a nucleic acid molecule is provided, which
has a sequence that encodes the chimeric enzyme
described above. The nucleic acid molecule comprises a
DNA binding moiety-encoding segment operably linked to
an integrase-encoding segment derivable from a
retroelement. The nucleic acid may be disposed within
a vector, which may be a retroviral vector (either a
replication competent vector or a "helper" virus), a
cloning vector or an expression vector. In a preferred
embodiment, the nucleic acid is disposed within a
retroviral vector. If the nucleic acid encodes a
chimeric enzyme in which the DNA binding moiety is
fused to the C-terminus of the integrase moiety, a
retroviral vector may be constructed. These retroviral
vectors may also contain a foreign DNA to be inserted
into a host genome.

The novel chimeric genes and fusion proteins
of the present invention represent a significant
advance in the art of stable integration of foreign
genes into host genomes. Whereas current retroviral
integrase systems are capable only of catalyzing random
integration of foreign DNA at low efficiency, the
modified integrases of the invention are capable of
enhancing the integration reaction and catalyzing
integration of foreign DNA at selected targets located
in a host genome.

BRIEF DESCRIPTION OF THE DRAWINGS

FIGURE 1. Figure 1A shows a schematic
representation of avian sarcoma virus (ASV) IN protein
and various derivatives. The wild type IN protein, 286
amino acids in length, is depicted. The highly
conserved amino acid residues are represented with the
single letter code. Their position numbers and the
domains that they define are also indicated. The
putative catalytic residues are in bold. The filled
boxes indicate the LexA repressor DNA binding domain
(87 amino acids). The numbering of the various
derivatives reflects the amino acids that are retained.
The dashed vertical lines indicate the first four amino
acids of IN that are retained as a "leader" or "spacer"
in the majority of deletion and fusion proteins. The
hatched region in LIN 39-207 indicates the presence of
two heterologous amino acids at the C-terminus. Figure
1B shows the lexA operator sequence (Sequence I.D. No.
1) used as a target for experiments shown in Figures 3
and 4. Restriction sites for cloning into pBR322 are
shown. The operator core sequences are shown in bold.
The vertical lines indicate axis of dyad symmetry for
each operator. Boxed region indicates the spacer
between the two operators.

FIGURE 2. In vitro processing assay for IN
and IN derivatives. Fig. 2A: wild type IN, IN 18-286,
L-IN 39-286, L-IN 52-286. Fig. 2B: IN 39-207,
L-IN 39-207. Time course is indicated. NP indicates a
control incubation without protein. Reaction products were
fractionated on polyacrylamide-urea sequencing gels.
"S" indicates labeled substrate strand from the duplex
substrate that mimics the retroviral DNA end (see
diagrams in lower parts of the figures; asterisk
indicates 32P label). ASV IN nicks the substrate
between the A and T (as indicated by arrow in diagram
Fig. 2A), releasing the TT dinucleotide. This produces
a strand two nucleotides shorter than the substrate
(denoted "-2" band). The "-3" nicking (Fig. 2B)
represents a non-specific activity which is prominent
when using the preferred Mn++ ion as a co-factor; the
catalytic domain fragment retains this "-3" activity.

FIGURE 3. Detection of in vitro integration
events using a PCR-based assay. Labeled PCR products
were fractionated on 7% polyacrylamide-urea sequencing
gels. IN or IN fusions indicated above the lanes were
incubated with the target plasmid containing the lexA
operator segment depicted in Figure 1B. The bands
corresponding to integration sites within the lexA
operators were determined by using the indicated size
markers (indicated in base pairs) as well as a ten-base
ladder (not shown). The estimated borders of this
region are indicated on the left. Arrows indicate
bands corresponding to enhanced integration sites
associated with N-LIN and C-LIN proteins.

FIGURE 4. Detection of in vitro integration
events using a PCR-based assay. Proteins used in each
assay are shown above. A control reaction was
incubated without IN (lane 1, -IN). Symbols are as
described in Figure 3.

FIGURE 5. Nicking (double strand break)
assay for ASV IN and derivatives. Supercoiled
substrate (form I, FI) was used in Figures 5A and 5B.
Linear substrate was used in Figure 5C. Proteins used
for reactions are indicated above lanes. Fig. 5A:
plasmid (pNDE-1), a pBR322 derivative lacking lexA
operators, was used as a substrate. DNA forms (F) I, II
and III are indicated. Marker DNA sizes are indicated
in base pairs. Lane 1, marked CON, indicates that the
reaction was incubated in the absence of IN. Fig. 5B:
The plasmid (pBRBSLexAOP) was used as a substrate.
This plasmid contains several tandem lexA operators.
The F II band in the starting material (CON, Lane 1)
may also contain supercoil dimer and a double strand
break would produce a linear dimer [denoted F III (D)].
Fig. 5C: as in A and B, except that a PstI-linearized
version of the pBRBSLexAOP plasmid was used as a
substrate. Markers for linearization at the lexA
operator region were generated by cleavage with PstI
and XhoI.

FIGURE 6. Strategy for construction of an
ASV genome encoding C-LIN. Top portion is a diagram of
the env-pol overlap region of ASV. A small peptide of
37 amino acids (stippled region) is normally removed by
the viral protease (PR) from the C-terminus of IN after
assembly. The env 3'-splice site (ss) is indicated.
Middle portion of the figure shows pLD6IS1, in which a
stop codon has been introduced that corresponds to the
proteolytic cleavage site and a noncoding spacer (nc)
has been introduced between pol and env. Using pLD6IS1
as an intermediate, pLD6 C-LIN was constructed by
inserting a PCR-generated LexA DBD coding fragment
(hatched) between existing Ban II and Afl II sites. In
the bottom portion of the figure, the two PCR primers
are shown, along with amino acids encoded at the
IN-LexA DBD junction (vertical dashed line) (amino acid
sequence is Sequence I.D. No. 2, nucleotide sequence is
Sequence I.D. No. 3) and at the new C-terminus of the
C-LIN fusion (amino acid sequence is Sequence I.D. No.
4, nucleotide sequence from right to left is Sequence
I.D. No. 5). Sequences in bold indicate LexA DBD
coding sequences and corresponding amino acids.

FIGURE 7. Virus replication as measured by
the reverse transcriptase assay after DNA transfection
and infection of CEFs. Fig. 7A: CEF cells were
transfected with a wild type viral DNA pLD6 (WT) and
the C-LIN construct pLD6IS1 C-LIN (C-LIN). Virus
production was monitored by the reverse transcriptase
assay. Control cultures (CON) were not transfected.
Fig. 7B: Supernatants collected on day 25 from Fig. 7A
were normalized for RT activity and were applied to
fresh CEF cultures. Control (CON) cultures were not
infected. Virus production was monitored by the
reverse transcriptase assay.

FIGURE 8. Western blot analysis of virus
collected at day 30 post transfection. The viral clone
used for the original transfection is indicated above
each lane (see Figure 7). Virus particles were
collected by pelleting and were applied to a 10%
SDS-polyacrylamide protein gel. Western blots were probed
with (Fig. 8A) anti-IN or (Fig. 8B) anti-LexA repressor
protein rabbit polyclonal antibodies. The blots were
developed with 125I-labeled protein G. Molecular weight
markers (in megadaltons) are indicated on the left side
of each panel.

DETAILED DESCRIPTION OF THE INVENTION

It has been discovered in accordance with the
present invention that the retroviral integrase (IN)
can be fused with a heterologous DNA binding moiety
(DB) to form a chimeric enzyme capable of enhancing
integration efficiency and facilitating site-specific
integration of a foreign DNA molecule into a target
DNA. These chimeric enzymes can be constructed as
precise fusions of a DB to the N-terminus or C-terminus
of retroviral IN. Alternatively, chimeric enzymes of
the invention can be constructed by deleting a portion
of the N-terminus of IN (specifically the ZF region,
which in ASV is up to approximately 40 amino acids from
the N-terminus) and thereafter fusing the DB to the
N-terminal truncated IN.

As described in the Background section,
wild type IN is capable of catalyzing all reactions
necessary for integration of retroviral DNA. The
precise C- and N-terminal fusion proteins of the
present invention retain all activities ascribable to
native IN, and are additionally capable of targeting a
foreign gene to the target sequence recognized by the
DNA binding moiety of the fusion protein and thereafter
promoting the enhanced use of integration sites nearby
the target DNA sequence. When optimized, these
proteins may also enhance efficiency of the integration
reaction, as well as facilitate targeted integration.
Additionally, it has been discovered in accordance with
the present invention that a retrovirus engineered to
encode a chimeric enzyme comprising a C-terminal fusion
of IN and a DNA binding moiety is competent to
replicate within host cells.

N-terminal fusions of the DNA binding moiety
to a location within the integrase ZF region are also
capable of exerting most of the activities of native IN
(i.e., processing viral DNA termini, concerted nicking
of target DNA and joining of viral DNA to target DNA);
however, integration catalyzed by these chimeric
enzymes is not targeted to a neighboring site, as it is
in the precise N- and C-terminal fusion proteins.
These chimeric enzymes should find an additional
utility as site-specific DNA cleaving agents, due to
their retention of the integrase concerted nicking
activity in the vicinity of the DNA binding site.

The chimeric enzymes of the present invention
are useful in a variety of forms for the purpose of
facilitating stable integration of foreign DNA at a
selected target position in a host chromosome or
genome. In one embodiment, currently-available
retroviral vectors can be modified such that they
encode a DB/IN chimeric enzyme (instead of a native
integrase). Such vectors may also encode the foreign
DNA desired for insertion into a host genome, or they
may be utilized as retroviral helper vectors to
integrate foreign DNA provided on a separate gene
transfer vehicle. Alternatively, purified enzymes of
the invention may be provided as adjuvant with a
foreign DNA of interest, to enhance targeted
integration of the DNA. The chimeric enzymes of the
invention are useful as research tools for studying the
effect of insertional activation or disruption of genes
in cultured cells or in animal model systems. More
importantly, the chimeric enzymes of the invention can
be used as diagnostic/therapeutic agents in a variety
of currently available and developing gene transfer
methodologies.

The detailed description set forth below
describes preferred methods for making and using the
chimeric enzymes of the present invention. Any
molecular cloning or gene transfer techniques not
specifically described are carried out by standard
methods, for example as generally set forth in Sambrook
et al., "DNA Cloning, A Laboratory Manual," Cold Spring
Harbor Laboratories, 1989 (hereinafter Sambrook et
al.); and Ausubel et al. (Editors), "Current Protocols
in Molecular Biology," John Wiley & Sons, Inc., 1995
(hereinafter Ausubel et al.).



PREPARATION OF CHIMERIC ENZYMES COMPRISING A
DNA BINDING MOIETY AND AN INTEGRASE MOIETY

The chimeric enzymes of the present invention
comprise two components: an integrase (IN) domain and a
DNA binding (DB) domain. These domains may be arranged
as precise C- or N-terminal fusions with respect to the
integrase moiety, or may comprise N-terminal fusions
wherein a portion of the IN N-terminus has first been
deleted. In accordance with the present invention, it
has been discovered that the ZF domain of IN
(comprising approximately the first 40 amino acids of
the protein in ASV) is not essential to maintain the
activities of the IN moiety. The fusion proteins of
the invention are sometimes referred to herein as
"DB/IN" or as "N-DB/IN" or "C-DB/IN" to denote that the
DNA binding moiety is attached at the N-terminus or the
C-terminus of the IN moiety, respectively. Finally,
chimeric enzymes that comprise an N-terminal fusion in
which a portion of the IN moiety has been deleted are
sometimes referred to herein as "DB/IN" followed by a
pair of numbers, which refer to the amino acid residues
included in the IN moiety of the fusion protein (for
example, residues 18-286 of ASV IN fused to the LexA
DNA binding domain, as described in Example 1, is
referred to as LIN 18-286).
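
The naming convention above can be illustrated with a short Python
sketch. It is offered only as an aid to reading the nomenclature: the
helper names (fusion_name, parse_fusion_name, retained_in_residues) are
hypothetical, and the 286-residue sequence used below is a placeholder
string, not the ASV IN sequence.

```python
def fusion_name(first: int, last: int, prefix: str = "LIN") -> str:
    """Build a name such as "LIN 18-286" from the retained IN residues."""
    return f"{prefix} {first}-{last}"


def parse_fusion_name(name: str) -> tuple[str, int, int]:
    """Split a name such as "LIN 18-286" into (prefix, first, last)."""
    prefix, residues = name.rsplit(" ", 1)
    first, last = residues.split("-")
    return prefix, int(first), int(last)


def retained_in_residues(in_sequence: str, first: int, last: int) -> str:
    """Return the IN residues retained in the fusion.

    Residue numbers are 1-based and inclusive, as in the text.
    """
    if not 1 <= first <= last <= len(in_sequence):
        raise ValueError("residue range lies outside the IN sequence")
    return in_sequence[first - 1:last]


# Placeholder for the 286-amino-acid wild type IN (not a real sequence).
placeholder_in = "X" * 286
print(fusion_name(18, 286))                               # LIN 18-286
print(parse_fusion_name("LIN 39-207"))                    # ('LIN', 39, 207)
print(len(retained_in_residues(placeholder_in, 18, 286)))  # 269
```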

The DB/IN fusion proteins of the invention
are prepared by recombinant DNA methods, in which DNA
sequences encoding each domain are "operably linked"
together such that upon expression, a fusion protein
having the targeting and integrase functions described
above is produced. As used herein, the term "operably
linked" means that the DNA segments encoding the fusion
protein are assembled with respect to each other, and
with respect to an expression vector in which they are
inserted (including retroviral vectors), in such a
manner that a functional fusion protein is effectively
expressed. The selection of appropriate promoters and
other 5' and 3' regulatory regions, as well as the
assembly of DNA segments to form an open reading frame,
employs standard methodology well known to those
skilled in the art.

Thus, preparing the chimeric enzymes of the
invention involves selecting DNA sequences encoding
each of the aforementioned components, operably linking
the respective sequences together in an appropriate
vector, and expressing the sequences to produce the
chimeric enzyme. Each of these steps is described
below.

It will be appreciated by persons skilled in
the art that the DNA components assembled for
expressing the chimeric enzymes of the present
invention can be prepared in a variety of ways,
including DNA synthesis, cloning, mutagenesis,
amplification, enzymatic digestion, and similar
methods, all available in the standard literature.
Additionally, certain of the components can be obtained
easily through commercial sources or by access to
public repositories, such as the American Type Culture
Collection. Alternatively, components that are not
readily available, and/or for which sequence
information is not available, can be isolated from
biological sources using standard hybridization methods
and homologous probes that are available.

Due to the high level of conservation among
retroviruses, retrotransposons and other retroelements,
DNA sequences encoding the integrase moiety of the
fusion proteins of the invention can be selected from
any of these sources. As used herein, the term
"retrovirus" refers to a class of viruses in which the
genetic material is RNA, but which complete their
replicative cycle by means of a DNA intermediate, which
becomes integrated into the genomic DNA of host cells
(see Background section). The term "retrotransposon"
refers to a class of mobile DNA elements that possess
long terminal repeats (LTRs, generally a few hundred
base pairs in length), which replicate through RNA
intermediates that are copied by reverse transcriptase
(a retrotransposon may be thought of as a retrovirus
for which an infectious extracellular form does not
exist, or has not yet been discovered). The term
"retroelement" refers collectively to retroviral DNA,
retrotransposons and other transposable elements having
the above-described characteristics. Retroelements
encode at least one, and generally all, of the
following three enzymes: reverse transcriptase (RT),
protease (PR) and integrase (IN). Any retroelement
that encodes an integrase may be utilized as a source for
DNA sequences encoding the IN moiety of the invention.
Such retroelements include, but are not limited to: (1)
retroviruses such as human T-cell leukemia virus (HTLV)
types I and II, bovine leukemia virus (BLV), simian
retrovirus type I (SRV-I), mouse mammary tumor virus
(MMTV), avian sarcoma virus (ASV), human
immunodeficiency virus (HIV), human spuma (foamy) virus
(HFV), visna lentivirus (VISNA), Moloney mouse leukemia
virus (Mo-MLV), feline immunodeficiency virus (FIV),
caprine arthritis-encephalitis virus (CAEV), equine
infectious anemia virus (EIAV), human endogenous
retrovirus (HERV) type C and type K, and hamster
intracisternal A-particle (IAP-18); and (2)
retrotransposons and other retroelements, such as Ty-1,
Ty-3, copia, Gypsy, 297, 17.6, and 412. For a review
of eucaryotic and procaryotic retroelements, see, e.g.,
Doolittle et al., Quarterly Review of Biology 64: 1-30
(1989); Garfinkle, Chapter 4 in The Retroviridae, Volume
1, J. A. Levy, Ed., Plenum Press, New York (1992), pages
107-158.

A variety of DNA binding proteins have been
isolated and characterized in recent years. Any of
these DNA binding proteins can serve as appropriate
sources for the DNA binding moiety of DB/IN fusion
proteins. Moreover, as new DNA binding proteins are
isolated and characterized, these may also be
utilized to construct DB/IN proteins of the invention.
For optimum practice of the present invention, it is
important to select DNA binding domains that recognize
one or more specific characteristic determinants of a
DNA molecule, rather than binding randomly to DNA. As
used herein, the term "characteristic determinant"
refers to one or more sequence or structural features
unique to a particular gene or other specified location
on a DNA molecule. As will be appreciated by those
skilled in the art, such characteristic determinants
generally involve a specific primary DNA sequence, as
well as positioning of one or more such sequences
relative to one another on a DNA double-helix, possibly
combined with other structural features of eucaryotic
chromosomal DNA.

Chimeric enzymes comprising the LexA DNA
binding domain are described in detail in Example 1.
These fusion proteins serve to demonstrate that
integrase indeed can be constructed as a fusion protein
with a DNA binding moiety to produce a chimeric enzyme
that can facilitate site-specific integration of a
foreign gene. However, the greater practical utility
of the invention is directed to eucaryotic host DNA
since retroviral vectors are used in eucaryotic
systems. Accordingly, DNA binding proteins targeted to
eucaryotic genes are preferred for practice of the
invention. Particularly preferred are eucaryotic
transcription factors that possess DNA binding domains,
such as those that act in concert with RNA polymerase,
as described by Burley, Current Opinion in Structural
Biology 4: 3-11 (1994). These include, but are not
limited to: TATA box binding proteins (TBP),
specifically TBP isoform 2; b/HLH/Z factors, such as
Max and USF; helix-turn-helix variants, such as the
third repeat of c-Myb, the POU-specific domain of Oct-1,
the homeodomain from [illegible]; the fork head domain
of HNF-3γ; the DNA binding domain of GATA-1; the
nucleic acid binding [illegible]; and the [illegible]
transcription factor [illegible]. [illegible] factors
[illegible] expressed in cellular development and
differentiation (see, e.g., [illegible], Transcription
Factors [illegible]), [illegible] KBF1, C/EBP and CRE
binding proteins, and other transcription regulators
encoded by genes such as erbA, ets, fos, jun, myb, myc,
rel and spi-1.

As discussed above, DB/IN [illegible] of the
invention may be utilized as purified [illegible];
DNA sequences encoding the protein [illegible]
retroviral vector and expressed [illegible] cellular
enzymes. Thus, either retroviral vectors or expression
vector systems will [illegible] to practice the
invention.

Numerous retroviral vector systems are
publicly or commercially available. These systems
generally comprise vectors for carrying the foreign DNA
of interest and a packaging or helper line which
contains viral sequences encoding trans-acting proteins
(although some vectors comprise [illegible]
competence). Several [illegible] as Lawrie & Temin,
Curr. Op. Genet. Devel. 3: 102-109 (1993). Others are
known in the art (see, e.g., Ausubel et al., Chapter
9.10, 9.11).

Retroviral vectors can be modified to
substitute a DB/IN of the invention for the standard
integrase gene carried on the vector, utilizing
standard recombinant DNA techniques. Alternatively,
the integrase gene carried by a retroviral vector may
be modified by adding sequences encoding a DNA binding
moiety at the N- or C-terminus, or within the
N-terminal ZF domain, in accordance with the description
set forth above. These substitutions or modifications
may be carried out on retroviral vectors destined to
contain the foreign DNA of interest or, alternatively,
on "helper" retroviral vectors in a gene transfer
system in which the foreign DNA of interest is carried
on a separate vector. Modification of a retroviral
vector to comprise a DB/IN-encoding gene of the
invention is described in greater detail in Example 1.

Retroviral vectors comprising DB/IN sequences
are used according to standard methods to transfect
eucaryotic cells (either cultured cells or cells within
a living organism). The chimeric enzyme of the
invention is then expressed within the transfected
cells and is thus available to facilitate targeted
integration of foreign genes carried on that same
vector or a separate vector.

The chimeric enzymes of the invention may
also be produced using other in vitro expression
methods known in the art. For example, DNA sequences
encoding the protein may be cloned into an appropriate
in vitro transcription vector, such as pSP64 or pSP65,
for in vitro transcription, followed by cell-free
translation in a suitable cell-free translation system,
such as wheat germ or rabbit reticulocytes. In vitro
transcription and translation systems are commercially
available, e.g., from Promega Biotech, Madison,
Wisconsin or BRL, Rockville, Maryland.

Alternatively, according to a preferred
embodiment, the chimeric enzymes may be produced by
expression in a suitable procaryotic or eucaryotic
cellular system. For example, DB/IN sequences may be
inserted into a plasmid vector adapted for expression
in a bacterial cell, such as E. coli, or into a
baculovirus vector for expression in an insect cell.
Such vectors comprise the regulatory elements necessary
for expression of the DNA in the bacterial or
eucaryotic host cell, positioned in such a manner as to
permit expression of the DNA in the host cell.
Production of a chimeric enzyme of the invention by
expression in a procaryotic system is described in
greater detail in Example 1.

The protein produced by expression in a
recombinant procaryotic or a eucaryotic system may be
purified according to methods known in the art. In a
preferred embodiment, a commercially available
expression/secretion system can be used, whereby the
recombinant protein is expressed and thereafter
secreted from the host cell, to be easily purified from
the surrounding medium. If expression/secretion
vectors are not used, an alternative approach involves
purifying the recombinant protein by affinity
separation, such as by immunological interaction with
antibodies that bind specifically to one or more
moieties of the recombinant protein. Such methods are
commonly used by skilled practitioners. Purification
of a chimeric enzyme of the invention after expression
in a procaryotic system is described in greater detail
in Example 1.

METHODS OF USING DB/IN ENZYMES,
AND SPECIFIC APPLICATION

The methods of the invention generally
involve combining host DNA (e.g., genomic DNA from
higher eucaryotes) with foreign DNA in the presence of
a chimeric enzyme of the invention in such a manner as
to enable the enzyme to catalyze the site-specific
integration of the foreign DNA into a selected target
site of the host DNA. The foreign DNA is carried
within a retroviral vector, or is modified to provide
linear double-stranded segments that have terminal DNA
sequences recognizable by the DB/IN enzyme. The enzyme
thereafter supplies the necessary processing and
joining catalytic functions for integration into the
pre-determined target site, and cellular repair enzymes
(e.g., DNA polymerases and ligases) provide the
requisite gap-filling and ligation functions to
accomplish full integration.
The chimeric enzymes of the invention can be
utilized for site-specific modification of host DNA
either extracellularly or intracellularly. In the
extracellular application, host DNA is removed from
cells or otherwise purified, either as a total DNA
fraction or as intact chromosomes, and combined with
the foreign DNA of interest and the chimeric enzymes
(or gene encoding the enzyme) in a suitable biological
buffer, along with any other biological reagents
necessary for completing integration of the foreign DNA
into the host DNA. This in vitro utility of the
chimeric enzymes of the invention should find broad
application in preliminary research, in which the
effect of targeted, stable insertion of a foreign DNA
into a host gene or chromosome is being studied or
explored.

In most applications of the invention, it is
preferable to utilize the DB/IN chimeric enzymes in
situ, i.e., within cells containing host DNA. To
accomplish this, the chimeric enzyme is introduced into
cells by expression from a retrovirus helper or vector,
or as protein or DNA. For a retroviral helper, a cell
line is created that expresses all retroviral proteins,
with the DB/IN replacing the native integrase. The
retroviral vector is then introduced by transfection
into a cell of interest. Similarly, a replication-
competent vector comprising C-terminal DB/IN fusions
and the foreign DNA of interest may be introduced into
a cell line by transfection or by a variety of other
methods known in the art for introducing DNA and/or
proteins into a cell. Such methods include, but are
not limited to: (1) calcium phosphate co-precipitation;
(2) DEAE-dextran treatment; (3) electroporation; (4)
biolistic delivery (i.e., bombardment of cells or
tissues with DNA-coated microparticles); (5)
microinjection; (6) "scrape-loading," as described by
McNeil et al., J. Cell Biol. 98: 1556-1564 (1984); and
(7) liposome- or erythrocyte-mediated transfection.
These and other currently-available methods may be
utilized on cultured or non-cultured cells, excised
tissues or within a living organism.

The chimeric enzymes of the present invention
will find their broadest utility as general enhancers
of the integration reaction and as facilitators of
targeted integration of foreign DNA into host DNA
within cultured cells or within cells of a living
organism. The enzymes will be useful as a research
tool to study the effect of insertional disruption or
enhancement of one or more genes within the natural
cellular environment. As a specific example,
transcription factors, which are the subjects of
intensive research, are known to regulate groups of
genes at different stages of development or at
different times in a cell cycle. These genes are
"turned on" or "turned off" at specific developmental
stages or times, such regulation orchestrating the
expression of genes under control of the transcription
factors. It is now known that these factors are open
for activation by certain changes occurring in the
chromosomal DNA which enable binding of DNA binding
proteins that activate expression of the transcription
factors. Since several of these DNA binding proteins
have already been isolated and characterized, their DNA
binding domains can be utilized in construction of
DB/IN proteins of the invention. Because the protein
is directed to a binding domain which is only "open" at
certain times, the chimeric enzyme may be used as a
probe to explore when, or at which developmental
stages, a particular transcription factor is open for
activation.

As another specific example, the DB/IN
chimeric enzymes will be useful as a diagnostic and/or
therapeutic tool, e.g., for disrupting genes known to
have a detrimental effect. For instance, certain
regulatory proteins are involved in cellular
proliferation. For such proteins in which an
activating DNA binding protein is known and has been
characterized (for a list of examples, see Latchman,
1991, supra, at page 155), chimeric enzymes can be
produced in accordance with the present invention to
specifically disrupt expression of such genes by stable
integration of a foreign DNA at the binding locus,
thereby permanently inactivating the detrimental gene.
In comparison, many gene- or RNA-inactivating
strategies involve only transient measures, such as
introduction of antisense molecules or ribozymes, and
do not permanently inactivate a gene. In addition, as
it becomes feasible to design DNA binding domains that
bind to selected DNA recognition sequences or other
characteristic determinants, DB/IN enzymes may be
constructed that catalyze integration into a specific
target site (e.g., at innocuous sites between expressed
genes), to enable gene addition therapy without
disrupting normal gene expression.

The use of the chimeric enzymes of the
invention to disrupt or augment gene expression can
first be explored in cultured cells, and thereafter can
be utilized in living organisms. In this regard, the
chimeric enzymes of the invention can be used to
particular advantage for gene transfer in germline
cells, such as bone marrow or stem cells from
peripheral blood. These cells can be genetically
manipulated ex vivo during the course of a normal
autologous stem cell transplantation procedure.

The DB/IN enzymes of the present invention
should be of substantial utility in facilitating site-
specific integration of foreign DNA molecules into host
DNA in vitro, in cultured cells, and in living animals.
This is a significant advance in the art with respect
to stable transformation of eucaryotic DNA by way of
retroviral vectors, which heretofore were capable only
of catalyzing random integration events.

The following example is provided to describe
the invention in further detail. This example is
intended to illustrate and not to limit the invention.

Example 1

In this example, we describe in detail
chimeric IN proteins containing the lexA DNA binding
domain (DBD) fused precisely at the N- or C-termini of
ASV IN, or within the N-terminal ZF domain of ASV IN.
These fusion proteins demonstrate great potential for
enhancing targeted retroviral integration into host
DNA.

MATERIALS AND METHODS

Construction and purification of LexA-IN
fusion proteins. We constructed a set of six LexA-ASV
IN fusions in which the LexA DBD replaced various
portions of the N-terminus of IN or was fused directly
to the N- or C-termini (Figure 1). The sequence
encoding the LexA DNA binding domain was derived from
the plasmid pRB451, kindly provided by R. Brent (the
sequence is also publicly available from GenBank,
National Center for Biotechnology Information,
Accession No. J01643-V00299-V00300; see also Horii et
al., Cell 23: 689-697, 1981). PCR primers were designed
to amplify the region corresponding to the LexA DBD,
codons 1-87. For N-LIN, EcoRI sites were engineered
into both PCR primers such that the LexA DBD coding
region could be inserted into the unique EcoRI site
located at the 5'-end of the RSV IN reading frame in
the expression vector pRC23IN (pRC23 is a commercially
available expression vector; pRC23IN is described as
"pRC23-p32" by Terry et al., J. Virol. 62: 2358-2365,
1988). N-terminal deletion mutations in which residues
5-52 or 5-39 of RSV IN have been deleted were described
elsewhere (Kulkosky et al., Virology 206: 448-456,
1995). These N-terminal deletions were assembled into
pRC23IN. In the deletion constructs, the EcoRI site at
the 5'-end of IN has been retained along with codons
for the first four IN amino acids. The LexA DBD
sequence, adapted with EcoRI sites, was inserted at the
unique EcoRI site in these two constructs, producing
the constructs LIN 39-286 and LIN 52-286 (Figure 1). A
similar approach was used to construct LIN 39-207 from
39-207, which was described elsewhere (Kulkosky et al.,
1995, supra).

The LIN 18-286 was constructed using a
different strategy. The C-terminal border of the LexA
DBD sequence is defined by an XmnI site, so an XmnI
partial digest was performed on pRB451. The PstI site
in the amp gene is common to pRC23IN and pRB451 and
this was used to transfer a small cassette containing
the Tac promoter from pRB451 along with the first 87
amino acids up to the XmnI site. This fragment was
cloned into pRC23IN that had been cleaved with BssHII,
repaired and cleaved with PstI. The BssHII site
corresponds to codon 17 of IN. Ligation of the blunt
XmnI site to the repaired BssHII site resulted in an
in-frame fusion between the LexA DBD codon 87 and ASV
IN codon 18. An analogous deletion (IN 18-286),
missing IN codons 5-17, was constructed by cleaving
pRC23IN with BssHII and EcoRI and inserting a small
EcoRI/BssHII linker that reestablished only the first
four codons.

The C-LIN (Figures 1 and 6) was constructed
in a manner that was amenable for reassembly into the
viral genome. The LexA DBD coding region was amplified
from the plasmid pRB451. The PCR primers included a
BanII site on the upstream side and an AflII site on
the downstream side. The downstream PCR primer
contained a stop codon which would terminate the LexA
DBD after amino acid residue 87. The BanII/AflII-
adapted LexA DBD fragment was inserted into pSP73ASV-
151 (Bouck et al., Mol. Cell. Biol. 15: 2663-2671,
1995), which contains the C-terminus of integrase as
well as an adapter that separates the IN coding region
from the overlapping env reading frame. The adapter
also contains a single nucleotide change that was
selected in vivo, which restores regulated splicing at
the neighboring env splice site.

The proteins diagrammed in Figure 1 were
purified from E. coli as described previously (Kulkosky
et al., 1995, supra), except for LIN 18-286, which was
purified by an earlier method (Terry et al., 1988,
supra). All proteins were soluble and behaved
similarly to the wildtype ASV IN.

LexA operator DNA. For the DNA nicking
experiments shown in Figure 5, a consensus lexA
operator segment was present in a ca. 750 bp
BamHI/SalI fragment from the plasmid 1107, kindly
provided by R. Brent (Brent & Ptashne, Cell 43: 729-
736, 1985). The operator region was transferred to
pBR322 by replacing the BamHI/SalI fragment. This
plasmid is denoted pBRBSLexAOP. The lexA operator had
XhoI ends and was inserted into an XhoI site. When
partial digests were carried out to confirm the
presence of the 24 bp operator, we noted a ladder of
five bands, consistent with the presence of four tandem
operators. Sequence analysis confirmed the presence of
multiple operators.

A second plasmid was constructed containing
two tandem lexA operators. A synthetic 63mer
oligonucleotide duplex was prepared containing BamHI
and SalI restriction sites for cloning into pBR322, two
consensus operators with upstream and downstream
spacers and a spacer sequence between the two operators
derived from the natural lexA tandem operators (Figure
1). The resulting plasmid pBR711-2 was sequenced to
confirm the presence of the operator segment.

Processing assay. LexA-IN fusions were
assayed for the ability to process viral DNA ends using
a model substrate described by Katzman et al., J.
Virol. 63: 5319-5327 (1989). Briefly, the substrate
consists of a synthetic duplex of 18 bp analogous to an
end of linear viral DNA. Incubation with IN in the
presence of 3 mM MnCl2 results in specific nicking two
nucleotides from the 3' end of one strand
("processing"). This reaction removes the TT
dinucleotide to produce a recessed end and exposes the
highly conserved CA dinucleotide, which becomes joined
to the host DNA. The 5'-end of the target strand is
labeled and the product, which is two nucleotides
shorter, is detected by electrophoresis on a denaturing
gel.
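
The processing step described above can be sketched in a few lines of Python. This is an illustration only: the 18-nucleotide sequence is hypothetical (the actual substrate sequence is given by Katzman et al., not in this text), and the function name process_3prime_end is an assumption introduced for the example. It shows how removal of the 3'-terminal dinucleotide leaves a strand two nucleotides shorter that ends in the conserved CA.

```python
def process_3prime_end(strand, n_removed=2):
    """Model IN 'processing' on one strand written 5'->3'.

    Returns (processed_strand, released_fragment).
    """
    if len(strand) <= n_removed:
        raise ValueError("strand is too short to be processed")
    return strand[:-n_removed], strand[-n_removed:]


# Hypothetical 18-nt strand ending in ...CATT-3' (not the published sequence).
substrate = "GATCGATCGATCGGCATT"
processed, released = process_3prime_end(substrate)

print(len(substrate), "->", len(processed))  # product is two nucleotides shorter
print("released dinucleotide:", released)
print("new 3' end:", processed[-2:])
```

On a denaturing gel, the labeled 16-nt product would run just ahead of the 18-nt substrate, which is the readout the assay relies on.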

Plasmid nicking assay. Assays for concerted
nicking were carried out as described by Terry et al.,
1988, supra, with some modifications. The reaction mix
contained 20 mM Tris-HCl pH 7.4, 3 mM MnCl2, 100 ng of
supercoiled plasmid DNA or linear DNA, and 5 pmoles of
IN or LexA-IN protein. Reactions were incubated for
0.5 to 2 hours and were loaded on a 1% agarose gel. We
observed that de-proteinization did not affect the
results, so this step was not included. The gels were
stained with ethidium bromide and visualized under
ultraviolet light.
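
As a rough worked example of the stoichiometry in this reaction mix, the sketch below converts 100 ng of plasmid into picomoles and compares it with the 5 pmoles of protein. Two values are assumptions not stated in the text: the plasmid length is taken as 4361 bp (unmodified pBR322; the derivatives used here differ somewhat), and the average mass of a base pair is approximated as 650 g/mol. The function name is likewise illustrative.

```python
AVG_BP_MASS = 650.0   # g/mol per base pair; common approximation (assumption)
PBR322_BP = 4361      # length of unmodified pBR322 (assumed for the derivatives)


def dsdna_pmol(mass_ng, length_bp, bp_mass=AVG_BP_MASS):
    """Convert a mass of double-stranded DNA (ng) into picomoles of molecules."""
    grams = mass_ng * 1e-9
    moles = grams / (length_bp * bp_mass)
    return moles * 1e12


plasmid_pmol = dsdna_pmol(100, PBR322_BP)   # 100 ng of plasmid per reaction
protein_pmol = 5.0                          # 5 pmoles of IN or LexA-IN
print(f"plasmid: {plasmid_pmol:.3f} pmol")
print(f"protein:plasmid molar ratio ~ {protein_pmol / plasmid_pmol:.0f}:1")
```

Under these assumptions the protein is in large molar excess over plasmid molecules, on the order of a hundredfold.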

Assay for integration into pBR322 targets
containing lexA operators. Integration of the model
viral DNA duplex into a heterologous target was
measured using a PCR-based assay similar to that
described by Pryciak et al., EMBO J. 11: 291-303
(1992). The target DNAs, pBR322 derivatives, contain
the lexA operator region. To measure integration into
a specific region, a "target" PCR primer was selected,
which was ca. 250 bp from the lexA operator region. The
"viral" primer was identical to the viral DNA strand to
be joined to the target DNA. The length of the PCR
product corresponds to the distance from the target
primer site to the integration site. Integration
assays contained approximately 0.2 pmoles of PstI-
linearized plasmid target DNA, 35 pmoles of IN or IN-
LexA fusion protein and 0.1 pmol of viral DNA ends
(model substrate). The reaction mixture contained 20 mM
Tris-HCl and 5 mM MgCl2. The viral DNA substrate was an
18mer/16mer duplex with a recessed end. This
corresponds to a processed viral DNA end, as described
previously (Katzman et al., 1989, supra). The viral
DNA substrate and IN were incubated on ice for 10
minutes and the target DNA and metal were then added.
Reactions were carried out for 60 to 90 minutes at
37°C. The reaction mix was treated with Proteinase K
(200 µg/ml f.c.) and 0.5% SDS for one hour at 37°C.
Carrier tRNA was added and nucleic acids were purified
by phenol extraction and concentrated by ethanol
precipitation. The target DNA, now containing inserted
viral DNA ends, was resuspended in 100 µl 200 mM Tris-
HCl (pH 7), 1 mM EDTA. Recovery of target DNA was
monitored by agarose gel electrophoresis. Samples (5
to 10 µl) were removed for PCR analysis. Standard PCR
conditions were 1 minute at 94°C, 1 minute at 37°C and
1 minute at 72°C, 30 cycles. The target PCR primer was
32P end-labelled and viral PCR primer was unlabelled.
Products were analyzed on 7% acrylamide-urea sequencing
gels. Fragment sizes were determined using a
φX174 HaeIII digest and a 10 bp ladder (Bethesda Research
Labs).
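
Because the PCR product length reports the distance from the target primer to the integration site, each band on the sequencing gel can be converted to an approximate position on the target. The sketch below illustrates that bookkeeping. It is not the authors' analysis: the function names, the coordinates of the operator window, and the assumption that each product includes the full length of the joined viral strand (taken here as 16 nt) are all hypothetical.

```python
def target_distance(product_len, viral_len=16):
    """Distance (bp) along the target from the target primer's 5' end to the
    junction, assuming the product also contains viral_len nt of viral DNA."""
    if product_len <= viral_len:
        raise ValueError("product is shorter than the viral end itself")
    return product_len - viral_len


def classify_site(product_len, op_start, op_end, viral_len=16):
    """Place an integration event relative to a hypothetical operator window
    given as distances (bp) from the target primer."""
    d = target_distance(product_len, viral_len)
    if d < op_start:
        return "between primer and operator"
    if d <= op_end:
        return "within operator region"
    return "beyond operator region"


# Hypothetical window: operators start ca. 250 bp from the primer, 63 bp long.
OP_START, OP_END = 250, 313
for band in (200, 290, 400):   # example product lengths in nt
    print(band, target_distance(band), classify_site(band, OP_START, OP_END))
```

Read this way, a lane in which bands mapping inside the window disappear while a band just outside it intensifies would correspond to protection of the operator with enhanced use of a neighboring site.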

Transfections, virus replication assays and
western blot analysis. Transfection of chicken embryo
fibroblasts was carried out as described by Katz &
Skalka, Mol. Cell. Biol. 10: 696-704 (1990), using the
DEAE-dextran method. Virus production was monitored
using a standard reverse transcriptase assay. For
protein analysis, virus particles were pelleted from
10 ml of culture supernatant and proteins were
fractionated on SDS polyacrylamide gels. Western blots
were carried out as described previously using rabbit
polyclonal antibody directed against the LexA repressor
protein and rabbit polyclonal antibody directed against
bacterially-produced ASV IN ("p36") (Stewart & Vogt, J.
Virol. 65: 6218-6231, 1991).

RESULTS 

Construction of lexA repressor DBD-IN fusion
proteins. The lexA repressor is composed of a DNA
binding domain (DBD, residues 1-87) and a dimerization
domain (residues 88-202). High affinity binding of the
repressor to cognate operators requires dimerization.
We designed several truncated and full length ASV IN
proteins containing the LexA repressor DBD (Figure 1).
Our initial premise was that the N- and C-termini of
IN may be involved in recognition of host or viral DNA,
and by replacing either terminus, we could determine
whether a heterologous DNA binding domain could
complement IN function. The isolated central domain of
ASV IN retains a nonspecific endonuclease and a
cleavage-ligation activity; therefore this domain must
be capable of DNA binding independently of the N- and
C-termini. The central domain of IN also contributes
to dimerization and therefore could provide the
dimerization function required for high affinity
binding of the lexA DBD. We fused the lexA DBD to the
N-terminus of ASV IN at residues 18, 39 and 52, which
resulted in partial or complete removal of the ZF
domain. In addition, we designed several analogous IN
deletions which lacked the lexA DBD. The lexA DBD was
also fused to the central domain, 39-207. Lastly we
constructed two fusions in which the LexA DBD was
present at either the N- or C-terminus (N-LIN and C-
LIN, respectively). All of the fusions shown in Figure
1 were soluble when expressed in E. coli. Two
additional constructs were designed to produce protein
in which the LexA DNA binding domain was either
tethered at the N-terminus through a spacer or fused
near the C-terminus (not shown). Proteins produced
from these clones could not be purified by standard
methods and were not analyzed further (not shown).

Processing and endonuclease activities of
fusion proteins. The chimeric proteins and deletion
mutants shown in Figure 1 were assayed for processing
activity in vitro using standard model substrates
(Figure 2). Wild type ASV IN is able to cleave the DNA
specifically following the conserved CA dinucleotide as
indicated. Fusion of the LexA DBD to IN amino acid
positions 18 or 39 resulted in retention of wildtype
processing activity (LIN 18-286, LIN 39-286). Joining
activity, manifested by insertion of the newly formed
ends in other substrate molecules, could also be
detected on longer exposures of the gel (data not
shown) and was similar to wildtype. Similar activities
were noted with constructs IN 18-286 and IN 39-286,
indicating that activity did not depend on the presence
of the LexA DBD (data not shown for the IN 39-286
protein). Fusion of the LexA DBD at residue 52 (LIN
52-286), however, resulted in a dramatic reduction of
processing activity. The non-fused version of 52-286
appeared to be unstable in E. coli and thus could not
be assayed.

We previously observed that deletion of the
C-terminus of IN (IN 1-207 construct) resulted in loss
of processing activity, but retention of a Mn++-
dependent endonuclease activity which cleaves between
the C and A ("-3" cut), as well as other sites
(Kulkosky et al., 1995, supra). The wildtype ASV IN
also displayed the "-3" cutting activity (Figure 2).
This activity is a property of ASV IN, as indicated by
specific inhibition by certain anti-ASV IN monoclonal
antibodies. The IN 39-207 and IN 52-207 fragments
retained this "-3" cutting activity (Figure 2, data not
shown for IN 52-207). The central domain fragments also
retained a cleavage-ligation activity, similar to the
"disintegration" activity originally described by Chow
et al., Science 255: 723-726 (1992). Fusion of the
lexA DBD to the 39-207 fragment resulted in retention
of this "-3" cleavage activity. From the results in
Figure 2, we concluded that the N-terminal ZF-domain
(1-40) is not essential for processing activity as
assayed under these conditions. Lastly, the N-LIN and
C-LIN proteins were assayed for processing and joining
and displayed processing and joining activities similar
to wild type IN (data not shown).

The LexA DBD can influence integration site
selection. Retroviral DNA integration into the host
cell DNA is essentially random. Some models for IN
structure-function suggest that a "non-specific DNA-
binding activity" could account for random selection of
host integration sites. To determine whether the
presence of a specific DNA binding domain (the LexA
DBD) could influence integration site selection, IN or
IN fusions were incubated with linearized target DNA
and a model viral DNA substrate. This substrate was a
16mer/18mer duplex with a "processed" end (the TT is
removed, see Figure 2). A PCR-based assay was used to
score for integration of model viral DNA substrate into
a plasmid target (see Materials and Methods). The
target plasmid was pBR322 containing two tandem lexA
operators inserted between the BamHI and SalI sites.
Two PCR primers were used to detect integration events.
One primer corresponds to a fixed site on the plasmid
and the second primer corresponds to the viral sequence
to be inserted. The size of the resultant PCR
fragments corresponds to the distance from a fixed site
on the plasmid to the integration site. It should be
noted that this assay detects only insertion of what is
equivalent to one viral DNA end, rather than the
coordinated insertion event that takes place in vivo
involving the two ends of viral DNA.
In the experiment shown in Figure 3 the
fixed PCR primer was located ca. 120 bp from the tandem
lexA operators. The wildtype IN (lane 1) catalyzed
integration into many sites and integration was not
entirely random, as indicated by the varying
intensities of bands. When purified LexA repressor
protein was added prior to the addition of IN, the
lexA operator region became protected from integration
events, indicating that the repressor is bound to the
operators (lane 2). A general reduction in integration
was noted, which may be due to some nonspecific
binding of the LexA repressor protein. The
patterns produced by N-LIN and C-LIN (lanes 3 and 5,
respectively) were identical to each other, but
differed from the pattern produced by wild type IN.
Integration events within the lexA operator region were
quenched. These results indicate that the fusion
proteins were binding to the lexA operators and thus
blocking integration by unbound fusion proteins into
this region. The use of a neighboring integration site
was tremendously enhanced in the case of both N-LIN and
C-LIN (arrow). Significant enhancement of one other
site closer to the primer was also noted (arrow).
These results suggest that LexA DBD directs the N-LIN
and C-LIN fusion proteins to the operator region and
that the bound fusion proteins direct integration into
nearby sites. Preincubation with equimolar amounts of
LexA repressor protein was not sufficient to block this
activity, indicating that the fusion proteins are
tightly bound (lanes 4 and 6). One alternative
explanation for the enhanced use of integration sites
would be that the bound proteins distort the DNA helix
in the vicinity of the operator and this may promote
selective utilization of sites in this region. However,
binding of the complete LexA repressor protein to the
operator has no such effect on wild type IN (Figure 3,
lane 2). Since tight binding to the operator requires
dimerization of the LexA DBD, we also conclude that the
IN portion of the fusion provides a dimer interface.

Interdomain flexibility may be required for
selection of neighboring integration sites. The
results in Figure 3 suggested that LexA DBD-IN fusions
were bound at the operator region and that this binding
promoted use of neighboring integration sites. Since
two of the N-terminal intra-domain fusions, LIN 18-286
and LIN 39-286, were active for processing and joining
(Figure 2), we used the PCR-based assay to assess
their ability to target integration into sites adjacent
to the operator. As controls, we assayed the
corresponding deletion mutants, IN 18-286 and IN 39-
286. The results shown in Figure 4 indicate that the
N-terminal deletions did not abrogate the ability of
these proteins to direct integration into a
heterologous target DNA (lanes 2 through 4). We
therefore conclude that the complete ZF domain is not
required for integration into a heterologous target DNA
in vitro. However, the results indicate that, as
compared to wild type, the deletions result in a slight
reduction in integration activity, as well as subtle
variations in integration site selection (compare lane
2 with lanes 3 and 4). The corresponding LIN 18-286
and LIN 39-286 proteins displayed quenching of
integration events in the operator region, indicating
that these proteins are selectively bound at these
operator sites (lanes 7 and 8). However, the
neighboring integration site (arrow) is no longer
preferred as in cases of N-LIN or C-LIN (lanes 5 and
6). The LIN 52-286 and LIN 39-207 proteins showed no
activity under these conditions (lanes 9 and 10), as
might be expected from the results in Figure 2. We
conclude that fusion of the LexA DBD within the ZF
domain results in a highly active protein which can
bind to lexA operators; however, fusion at the precise
N-terminus allows the selection of preferred sites
neighboring the operator. One interpretation is that
the precise N-terminal (and C-terminal) fusions allow
sufficient flexibility for the IN domain to access
neighboring sites, while intra-domain fusion does not.

Targeting of IN endonuclease activity. We
used a second approach to assess the selective binding
and activity of the LexA DBD-IN fusions at the lexA
operator site. With Mn++ as a cofactor, ASV IN is able
to nick and linearize supercoiled plasmids. This non-
specific nicking seems to correspond to cleavage
observed at the -3 site on the viral DNA substrates
shown in Figure 2. The biological relevance of this
activity is unknown, but it requires the conserved
residues in the catalytic domain. The supercoiled
substrate shown in Figure 5A is a pBR322 derivative
lacking lexA operator sequences (lane 2). Two of the
fusion proteins diagrammed in Figure 1, LIN 18-286 and
LIN 39-286, were incubated with this plasmid in the
presence of MnCl2 and the products were analyzed on an
agarose gel (Figure 5A, lanes 4 and 7, respectively).
These fusion proteins, as well as wild type IN (lane 3)
and the two non-fused catalytic domain fragments (lanes
5 and 6), all displayed similar levels of nicking
activity as measured by production of form II (FII) and
a small amount of linear form III DNA (FIII). These
results confirm that the nicking activity maps to the
ASV IN catalytic domain and is not affected
considerably by various deletions and fusions. A
similar experiment was carried out in Figure 5B, except
that the supercoiled substrate plasmid, pBRBSLexAOP,
contained tandem lexA operators. Strikingly, the
efficiency of linearization is much greater with the
four proteins that contain the LexA DBD (Figure 5B,
lanes 3, 4, 5 and 8) as compared to wild type IN or the
non-fused central domain fragments (lanes 2, 6 and 7).
The linear product (FIII) appears to be formed at the
expense of supercoiled substrate (FI), indicating
concerted nicking. Although these assays are not
highly quantitative, the selectivity for the operator
is most pronounced when the IN 39-207 protein is
compared to the corresponding LexA fusion protein, LIN
39-207 (Figure 5B, lanes 7 and 8, respectively).
The results shown in Figures 5A and 5B
suggest that the enhanced linearization of the
operator-containing plasmids was due to preferential
binding of the LexA-IN fusion proteins to the operator
region, followed by localized concerted nicking. A
linear substrate was used in order to map the preferred
cleavage site in relation to a fixed site. The lexA
operator-containing substrate was linearized with PstI,
purified and then incubated with unfused IN or LIN 18-
286. The results, shown in Figure 5C, demonstrate that
the LexA-IN fusion was able to preferentially cleave at
or near the lexA operator, as indicated by generation
of two discrete fragments of ca. 3400 bp and 1300 bp.
These fragments migrated with the two marker fragments
generated by cleavage with XhoI, which cleaves at the
operator sites. All other LexA-IN fusions tested
showed a similar profile on PstI-linearized substrates
(data not shown). The bands produced by LIN 18-286
were somewhat heterogeneous, indicating that the ends
might be frayed (Figure 5C, lane 5). Fractionation of
labelled DNAs on higher resolution gels confirmed that
the termini produced by LIN 18-286 corresponded to many
sites within, and flanking, the lexA operator region
(data not shown).
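
The mapping logic here is simple arithmetic: a single double-strand break in a linear molecule yields two fragments whose lengths sum to the full length, so fragments that co-migrate with the XhoI markers place the break at or near the operator. A minimal sketch follows, assuming a linear substrate of about 4700 bp (the sum of the two reported fragments) and a hypothetical 10% sizing tolerance for agarose gels; the function names and the example cut positions are illustrative, not data from the experiments.

```python
def break_fragments(linear_len, cut_pos):
    """Fragment lengths (longest first) from one double-strand break at
    cut_pos, measured in bp from one end of a linear molecule."""
    if not 0 < cut_pos < linear_len:
        raise ValueError("cut must fall inside the molecule")
    return tuple(sorted((cut_pos, linear_len - cut_pos), reverse=True))


def comigrates(observed, marker, tolerance=0.10):
    """True if every observed fragment is within tolerance of its marker."""
    return all(abs(o - m) <= tolerance * m for o, m in zip(observed, marker))


LINEAR_LEN = 3400 + 1300          # ca. 4700 bp, from the reported fragments
xhoi_marker = break_fragments(LINEAR_LEN, 1300)   # cut at the operator
fusion_cut = break_fragments(LINEAR_LEN, 1350)    # hypothetical nearby break
distant_cut = break_fragments(LINEAR_LEN, 2300)   # hypothetical distant break

print(xhoi_marker, comigrates(fusion_cut, xhoi_marker))
print(distant_cut, comigrates(distant_cut, xhoi_marker))
```

Breaks scattered within and around the operator would give a cluster of closely spaced fragment pairs, which is one way to picture the somewhat heterogeneous bands described for LIN 18-286.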

As can be seen in Figure 5C, lane 4, wild
type IN does not produce any major uniform fragments
from the linear DNA substrate, indicating that double
strand breaks occur less frequently and less
selectively. Interestingly, when gels similar to that
shown in Figure 5C were probed after Southern blotting,
the results indicated that the major site of double
strand breaks produced by unfused IN also
corresponded to the lexA operator
region. We believe that the tandem array of
palindromic operators may present a highly structured
region of DNA which may be generally sensitive to
nucleases and thus may enhance the detection of double
strand breaks by both the fused and unfused IN
proteins.

From these experiments, we conclude that the
Mn-dependent nicking activity of the ASV IN catalytic
domain can be directed to lexA operator sites by fusion
with the LexA DBD and this results in site-specific
double stranded breaks.

The C-LIN protein is functional in vivo. One
object of this invention is to influence, or re-direct,
retroviral DNA integration using a modified IN protein.
Since the N-terminus of IN is fused to the RT in the
gag-pol precursor in all retroviruses, it would be
difficult to engineer a DBD (e.g., N-LIN) at this site
in the viral genome. For example, a new viral protease
(PR) site would have to be engineered in order to
release the IN fusion from the gag-pol precursor.
Also, in the case of ASV, IN exists as a domain of the
RT β-chain, as well as a free peptide; thus, the
heterologous DBD would likely interrupt folding of the
RT β-chain. We therefore constructed an ASV DNA clone
which encodes C-LIN, such that the DBD was tethered to
the C-terminus of IN as well as to the C-terminus of
the RT β-chain. This required re-engineering of the
3'-end of the pol gene in a manner which would not
disturb cis-acting signals or overlapping coding
regions. The same strategy was used to generate the
bacterial expression construct and the relevant
fragment was simply shifted to the viral DNA clone.
Normally, the IN coding region partially overlaps with
the env coding region (Figure 6). The overlapping
portion encodes a C-terminal ca. 4 kDa peptide that is
removed by the viral protease (PR) during virion
morphogenesis and this peptide was shown to be
nonessential (Katz & Skalka, 1988, supra). We
therefore replaced the coding sequences for this
peptide with the coding sequence for the LexA DBD. The
cleavage site for PR should be destroyed by the fusion
step. In order to preserve the intronic portion of the
env splice acceptor site (which also overlaps with the
IN coding region) we used a starting clone pLD6ISl
(Katz & Skalka, 1988, supra) which contains a noncoding
spacer between env and pol (Figure 6, denoted nc). The
spacer also contains a cis-acting suppressor mutation
which was selected in vivo to maintain RNA splicing
regulation.

DNA transfection of susceptible chicken cells
with the C-LIN viral construct resulted in the
appearance of infectious virus (Figure 7A). There was a
significant delay as compared to wild type, indicating
that the alterations produced a replication defect.
The delay was not due to use of the pLD6ISl parent,
since this virus replicates at a similar rate as wild
type. These results indicated that viruses containing
the additional IN domain have some replicative
capacity; however, the delay, followed by rapid
appearance of infectious virus (between days 15 and 18),
suggested the possibility of a genetic change that
restored full replicative capacity. In comparison,
replacement of the conserved carboxylate residues of
the catalytic domain (Figure 1) results in a complete
block in ASV replication when assayed under similar
conditions.

Western blot analysis (Figure 8) using both
anti-IN and anti-LexA repressor antibodies demonstrated
that viruses from pLD6ISl C-LIN-transfected cultures
contained a mixture of C-LIN (ca. 42 kDa) and IN-like
proteins (32 kDa). At day 30 post transfection, the
C-LIN to IN ratio was approximately 1 to 5 (Figure 8A).
As predicted, the LexA DBD is also detected on the
β-chain of RT (Figure 8B). In similar western blots from
virus collected at earlier times (e.g., days 16 and 17)
the C-LIN to IN ratio was approximately one-to-one
(data not shown), suggesting a gradual loss of the LexA
DBD. The free LexA DBD could not be detected in viral
particles (data not shown), suggesting that the loss of
this domain was the result of a genetic change, rather
than proteolytic cleavage. To confirm the apparent
genetic instability of the C-LIN virus, culture
supernatants were collected at day 25 from the
transfections shown in Figure 7A, normalized by reverse
transcriptase activity and then re-applied to fresh
cells. As expected, virus collected from the C-LIN
cultures now appeared at a similar rate to wild type.
We note that the band corresponding to IN in the
C-LIN-transfected cultures does not appear as the
characteristic "doublet" of wild type IN. This
observation suggests that the restored IN is not simply
wild type and further analyses are under way. From
these experiments we concluded that the C-LIN protein
is produced in vivo and is assembled into virus
particles. Although there is genetic selection against
viruses that encode this domain, the C-LIN protein is
likely functional in vivo (see Discussion) and is
relatively stable.

DISCUSSION

The selection of retroviral integration sites
is essentially random within host cell DNA, when
catalyzed by a wild type integrase. The evidence
presented here demonstrates that IN can be augmented
such that it is directed to a specified DNA sequence.
This approach allows the study and use of IN molecules
bound at a particular site on the DNA.

We have shown that fusion of the LexA DBD to
either the N- or C-termini of ASV IN strongly
influences integration site selection in vitro. The
results are consistent with the binding of these fusion
proteins to lexA operator sequences followed by
enhanced use of an adjacent integration site. The
preferred integration site is located approximately one
turn of the DNA helix from the lexA operator sites.
Fusion of the LexA repressor to the precise IN termini
appears to be important for preferential integration
into neighboring sites. For example, LIN 18-286 and
LIN 39-286 are active for integration and are also able
to bind to operators (Figure 4); however, these fusions
do not preferentially select the neighboring
integration site used by N-LIN (and C-LIN). One
interpretation of these results is that there must be
sufficient flexibility between the two fusion partners
such that the neighboring integration site can be
accessed. Interestingly, the intra-domain fusions LIN
18-286, LIN 39-286 and LIN 52-286 were able to
selectively introduce double strand breaks within or
adjacent to the operator region, indicating that the
juxtaposition of domains is not critical for this
activity (Figure 5). Alternatively, the double-strand
break assay may be inherently less quantitative and
perhaps more sensitive.

The N-LIN and C-LIN fusions both selected the
same preferred integration sites adjacent to the
operators. This result suggests that these two fusions
may be similarly positioned on the DNA, consistent with
the notion that N- and C-termini are adjacent in the
native IN structure.

The observation that all of the fusion
proteins can be efficiently targeted to the operator
region suggests that the IN domain provides a dimer
interface. The LexA repressor normally binds to its
cognate operator as a dimer, with the DBDs from each
monomer contacting a "half-site" of the operator
sequence (Figure 1). The LexA repressor DBD (1-87) has
been used extensively as a fusion partner to detect
activation domains of transcription factors. To be
efficient, the system requires that a multimerization
domain be present. The positioning of the LexA DBD in
relation to the dimerization domain is also important:
the two DBDs in the dimer must be able to contact both
halves of the operator simultaneously for high affinity
binding. ASV IN functions as a multimer and
biochemical mapping studies have indicated that
self-association functions are contributed by both the
catalytic domain and the C-terminal domain. Genetic
analysis has also indicated that the catalytic domain
of HIV IN contains a self-association function. The
results presented here are consistent with the
catalytic domain providing a dimer interface.

We also observed that fusion of the LexA DBD
to positions 18 and 39 of ASV IN (LIN 18-286, LIN
39-286) resulted in retention of processing and joining
activity, while fusion at position 52 (LIN 52-286)
resulted in a severe reduction in activity (Figures 2
and 4). These fusions interrupt the conserved ZF
domain. Although the isolated ZF domain does not bind
to DNA, it has been implicated in DNA binding within
the context of the whole IN protein. To determine if
the LexA DBD might be complementing ZF domain function,
we analyzed the corresponding deletion mutants (IN
18-286, IN 39-286). We find that these mutant proteins
are also active, although their activity may be
slightly reduced as compared to their fusion
counterparts (Figure 4). Thus, the ZF domain is not
essential for processing or joining to a heterologous
target DNA in vitro. Bushman and Wang (J. Virol. 68:
2215-2223, 1994) have reported similar results, except
that the fusion of short heterologous peptides was
required to complement the ZF deletion. We showed
previously that the ASV ZF domain was not sufficient
for DNA binding, although we did detect modest effects
of single amino acid substitutions of either histidine
9 (H9) or histidine 13 (H13) residues (see Figure 1) in
the presence of MgCl2. Here we show that the various
ASV ZF deletion and fusion constructs are capable of
catalyzing the insertion of model viral DNA ends into a
heterologous target DNA in the presence of MgCl2. Mn++
is the preferred metal co-factor for ASV IN in vitro
and thus the use of the more biologically relevant, but
less active, Mg++ co-factor (Figures 3 and 4) provides a
more stringent screen for the effects of these
mutations. Although the ZF domain is not required in
vitro, its conservation suggests that it does indeed
have an important function. Substitution of ASV H9 or
H13 produces a severe replication defect in vivo (not
shown), which agrees with similar experiments in the
HIV-1 system.

We provide evidence that all of the IN-LexA
DBD fusions described here have varying degrees of
selectivity for operator sequences. Although the
relative binding of native IN versus fused IN was not
directly measured, the results shown in Figure 5
indicate an approximate 5-10 fold selectivity for
introducing double strand breaks at or near the
operator region. How might the LIN fusion proteins be
arranged on the operator region to promote cleavage of
both strands? Under limited nicking conditions, we
observe many breaks focused within the operator region,
but emanating to more distant sites. This suggests
that IN or IN segments are able to make DNA contacts at
some distance from the operator. The contact could be
through aggregation of LIN molecules along the DNA or
looping in of neighboring DNA sequences through binding
to the catalytic domain. Although these possibilities
cannot be distinguished from the experiments performed
thus far, we generally believe that the enhancement of
double strand breaks is the result of a high local
concentration of LIN molecules at the operator region.

The fusion proteins described here enhance
integration at a specific site, but integration events
are not dependent on the operator sequences. Achieving
absolute operator-dependent integration may require
optimizing the arrangement of the IN dimer interface
and the LexA DBD such that the cooperative interactions
that occur between the native LexA repressor protein
dimer and the operator can be fully reproduced.
Greater selectivity might also be achieved by
inactivating the IN domain that is normally responsible
for binding to random host DNA. This non-specific DNA
binding activity may map to the C-terminus of IN (Khan
et al., Nucl. Acids Res. 19: 851-860, 1990) inasmuch as
removal of this entire domain results in increased
selectivity in the double strand break assay shown in
Figure 5 (LIN 39-207). Perhaps the arrangement of lexA
operator sequences could also be optimized. In the
experiments shown in Figures 4 and 5, we used two
tandem operators that are in phase along the helix
(Figure 1). We speculated that this arrangement might
promote higher order interactions between fusion dimers
bound at each operator. It is possible that higher
order IN complexes (e.g., tetramers) are the active
form for the joining reaction of the viral ends to
target sequences and this tandem operator arrangement
might increase IN activity by promoting such complexes.
Although this hypothesis has not been systematically
tested, we have found that single operators or tandem
out-of-phase operators (such as used in Figure 5) are
sufficient for binding and double strand break
activity, but not for the efficient, selective joining
activities of C-LIN and N-LIN displayed in Figures 3
and 4.
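
The "in phase" arrangement discussed above can be illustrated with a
short calculation. The sketch below is not part of the described
experiments: it assumes that SEQ ID NO:1 of the Sequence Listing is
the tandem operator sequence, reads the repeated 16-bp palindrome
directly off that sequence, and uses the textbook average of about
10.5 bp per turn for B-form DNA. Operators whose start-to-start
spacing is a whole number of turns lie on the same face of the helix.

```python
# Illustrative sketch only. Assumptions (not stated in this section):
#   - SEQ ID NO:1 is taken to be the tandem operator sequence
#   - OPERATOR is the 16-bp repeat read off SEQ ID NO:1
#   - 10.5 bp/turn is the textbook B-DNA average
SEQ_ID_NO_1 = (
    "GGGGTCGACA" "TTACTGTATA" "TATATACAGC" "ATAACTGTAT"
    "ATATATACAG" "TATAGGATCC" "GGG"
)
OPERATOR = "CTGTATATATATACAG"
BP_PER_TURN = 10.5


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(seq, motif):
    """Return 0-based start positions of every occurrence of motif."""
    starts = []
    i = seq.find(motif)
    while i != -1:
        starts.append(i)
        i = seq.find(motif, i + 1)
    return starts


starts = find_all(SEQ_ID_NO_1, OPERATOR)
spacing_bp = starts[1] - starts[0]   # start-to-start distance
turns = spacing_bp / BP_PER_TURN     # same distance in helical turns

print("operator palindromic:", reverse_complement(OPERATOR) == OPERATOR)
print("operator starts (1-based):", [s + 1 for s in starts])
print("spacing:", spacing_bp, "bp =", turns, "helical turns")
```

Under these assumptions the two copies are separated by a whole number
of turns, which is what an in-phase tandem arrangement requires; an
out-of-phase pair would give a fractional value.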

The fusion protein strategy we have described
has also shown potential for targeting or enhancing
retroviral integration in vivo. The C-LIN protein is
highly active in vitro, is able to bind to lexA
operators and promotes integration into nearby regions.
We have also shown that it can be assembled into viral
particles (Figure 8), and our experiments indicate that
the C-LIN fusion is active in vivo. Under our standard
transfection conditions (Figure 7), RT activity could
not be detected unless multiple rounds of infection
have occurred. For example, DNA transfection with an
IN catalytic site mutant did not produce sufficient RT
activity for detection (data not shown) because there
was no significant amplification by cell-to-cell
spread. Under our standard conditions, catalytic site
IN mutants do not revert, presumably because this would
require low-level replication. The results in Figure 7
suggested that pLD6ISl C-LIN virus was partially
defective due to the presence of the extra IN domain,
but could replicate sufficiently such that stabilizing
mutations arose. Results of western blot analysis at
early and late times support this interpretation. We
conclude that the C-LIN protein is likely functional in
vivo. We also considered the possibility that C-LIN
molecules undergo proteolytic cleavage within the viral
particle such that the LexA DBD is released and that
the resulting nonfused IN is the biologically active
form. However, we have not detected the expected ca. 87
amino acid peptide in viral particles.

In summary, our results demonstrate the
potential for augmentation of IN activity in vitro and
in vivo. It can be postulated that an additional IN
domain could also be used to target the retroviral
pre-integration complex to protein components as well as
DNA targets. The targeting and/or enhancement of
retroviral integration should be extremely useful for
basic research, to probe the interrelationship between
various genes in complex gene expression pathways, as
well as for gene therapy.



The present invention is not limited to the
embodiments described above, but is capable of
variation and modification without departure from the
scope of the appended claims.

SEQUENCE LISTING



(1) GENERAL INFORMATION:

(i) APPLICANT: Katz, Richard A.
               Skalka, Anna Marie

(ii) TITLE OF INVENTION: Chimeric Enzyme for Promoting Targeted
Integration of Foreign DNA into a Host Genome

(iii) NUMBER OF SEQUENCES: 5

(iv) CORRESPONDENCE ADDRESS:
(A) ADDRESSEE: Dann, Dorfman, Herrell & Skillman
(B) STREET: 1601 Market Street, Suite 720
(C) CITY: Philadelphia
(D) STATE: PA
(E) COUNTRY: USA
(F) ZIP: 19103

(v) COMPUTER READABLE FORM:
(A) MEDIUM TYPE: Floppy disk
(B) COMPUTER: IBM PC compatible
(C) OPERATING SYSTEM: PC-DOS/MS-DOS
(D) SOFTWARE: PatentIn Release #1.0, Version #1.30

(vi) CURRENT APPLICATION DATA:
(A) APPLICATION NUMBER: US
(B) FILING DATE: 09-MAY-1995
(C) CLASSIFICATION:

(viii) ATTORNEY/AGENT INFORMATION:
(A) NAME: Reed, Janet E.
(B) REGISTRATION NUMBER: 36,252

(ix) TELECOMMUNICATION INFORMATION:
(A) TELEPHONE: (215) 563-4100
(B) TELEFAX: (215) 563-4044



(2) INFORMATION FOR SEQ ID NO:1:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 63 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:1:
GGGGTCGACA TTACTGTATA TATATACAGC ATAACTGTAT ATATATACAG TATAGGATCC
GGG

(2) INFORMATION FOR SEQ ID NO:2:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 11 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:2:

Ser Pro Leu Phe Ala Lys Ala Leu Thr Ala Arg
1               5                   10

(2) INFORMATION FOR SEQ ID NO:3:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 39 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:3:
CGCGCGAGCC CGTTATTCGC TAAAGCGTTA ACGGCCAGG

(2) INFORMATION FOR SEQ ID NO:4:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 7 amino acids
(B) TYPE: amino acid
(C) STRANDEDNESS: not relevant
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: peptide
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:4:

Val Ala Ala Gly Glu Pro Ala
1               5

(2) INFORMATION FOR SEQ ID NO:5:

(i) SEQUENCE CHARACTERISTICS:
(A) LENGTH: 30 base pairs
(B) TYPE: nucleic acid
(C) STRANDEDNESS: double
(D) TOPOLOGY: not relevant

(ii) MOLECULE TYPE: DNA (genomic)
(iii) HYPOTHETICAL: NO
(iv) ANTI-SENSE: NO

(xi) SEQUENCE DESCRIPTION: SEQ ID NO:5:
OCGCGCTTAA GCTGGTTCAC CGGCAGCCAC 
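
The nucleotide and peptide entries of the listing can be checked
against each other with the standard genetic code. The sketch below
is a consistency check only; the reading frames it uses are
observations made from the listed sequences themselves, not statements
taken from the specification. It assumes that SEQ ID NO:3, read from
its seventh base, encodes the 11-residue peptide of SEQ ID NO:2 (the
AAA codon at position 6 encodes lysine), and that the reverse
complement of SEQ ID NO:5 encodes the 7-residue peptide of SEQ ID NO:4
followed by a stop codon. The first base of SEQ ID NO:5 is not legible
in this copy, so only bases 2-30 are used; that base lies outside the
coding frame examined here.

```python
# Consistency check using the standard genetic code (one-letter codes).
# Reading frames are assumptions inferred from the listed sequences.
BASES = "TCAG"
AMINO = (
    "FFLLSSSSYY**CC*W" "LLLLPPPPHHQQRRRR"
    "IIIMTTTTNNKKSSRR" "VVVVAAAADDEEGGGG"
)
CODON_TABLE = {
    a + b + c: AMINO[16 * i + 4 * j + k]
    for i, a in enumerate(BASES)
    for j, b in enumerate(BASES)
    for k, c in enumerate(BASES)
}


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def translate(seq):
    """Translate from the first base, stopping at the first stop codon."""
    peptide = []
    for i in range(0, len(seq) - 2, 3):
        aa = CODON_TABLE[seq[i:i + 3]]
        if aa == "*":
            break
        peptide.append(aa)
    return "".join(peptide)


SEQ_ID_NO_3 = "CGCGCGAGCC" "CGTTATTCGC" "TAAAGCGTTA" "ACGGCCAGG"
# Bases 2-30 of SEQ ID NO:5; base 1 is unreadable in this copy.
SEQ_ID_NO_5_TAIL = "CGCGCTTAA" "GCTGGTTCAC" "CGGCAGCCAC"

SEQ_ID_NO_2 = "SPLFAKALTAR"   # Ser Pro Leu Phe Ala Lys Ala Leu Thr Ala Arg
SEQ_ID_NO_4 = "VAAGEPA"       # Val Ala Ala Gly Glu Pro Ala

pep_from_3 = translate(SEQ_ID_NO_3[6:])
pep_from_5 = translate(reverse_complement(SEQ_ID_NO_5_TAIL))

print("SEQ ID NO:3 frame 7 ->", pep_from_3, pep_from_3 == SEQ_ID_NO_2)
print("SEQ ID NO:5 revcomp ->", pep_from_5, pep_from_5 == SEQ_ID_NO_4)
```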
What is claimed is:

1. A chimeric enzyme comprising a DNA binding
moiety and an integrase moiety derivable from a
retroelement, said enzyme being capable of binding to a
target DNA molecule having a characteristic determinant
recognized by the DNA binding moiety, said enzyme having
at least one activity characteristic of a retroelement
integrase.

2. The chimeric enzyme of claim 1, wherein
said retroelement is a retrovirus.

3. The chimeric enzyme of claim 2, wherein
said retrovirus is selected from the group consisting of
ASV, HIV, HFV, Mo-MLV, VISNA and FIV.

4. The chimeric enzyme of claim 1, wherein
said retroelement is Ty-1.

5. The chimeric enzyme of claim 3, wherein
said integrase moiety is ASV integrase and said DNA
binding moiety is obtained from a LexA DNA binding
protein.

6. The chimeric enzyme of claim 1, wherein
said DNA binding moiety is appended to said integrase
moiety at one terminus of said integrase moiety.

7. The chimeric enzyme of claim 6, wherein
said DNA binding moiety is appended to said integrase
moiety at the amino terminus of said integrase moiety.

8. The chimeric enzyme of claim 6, wherein
said DNA binding moiety is appended to said integrase
moiety at the carboxyl terminus of said integrase moiety.

9. The chimeric enzyme of claim 1, wherein
said integrase moiety is truncated from its amino
terminus so as to remove part or all of its ZF domain,
and said DNA binding moiety is appended to said integrase
moiety at said truncated amino terminus.

10. The chimeric enzyme of claim 1, wherein
said DNA binding domain is derivable from a DNA binding
protein selected from the group consisting of
helix-turn-helix proteins and b/hlh/z proteins.

11. The chimeric enzyme of claim 1, wherein
said at least one activity of said retroelement integrase
is selected from the group consisting of:

(a) processing of retroelement DNA termini;

(b) nicking within double stranded DNA;

(c) integrating a DNA molecule having
processed retroelement termini into DNA molecule; and

(d) a combination of (a), (b) or (c).

12. The chimeric enzyme of claim 6, wherein
said at least one activity of said retroelement integrase
is integrating a DNA molecule having processed
retroelement termini into said target DNA molecule, and
said integrating activity is directed to a site on said
target DNA adjacent to said characteristic determinant
recognized by said DNA binding moiety.

13. A nucleic acid molecule having a sequence
that encodes the chimeric enzyme of claim 1.

14. A nucleic acid molecule having a sequence
that encodes a chimeric enzyme comprising a DNA binding
moiety and an integrase moiety derivable from a
retroelement, said enzyme being capable of binding to a
target DNA molecule having a characteristic determinant
recognized by the DNA binding moiety, said enzyme having
at least one activity characteristic of a retroelement
integrase, said nucleic acid molecule comprising a DNA
binding moiety-encoding segment operably linked to an
integrase-encoding segment derivable from a retroelement.

15. The nucleic acid molecule of claim 14,
wherein said retroelement is a retrovirus.

16. The nucleic acid molecule of claim 14,
wherein said retrovirus is selected from the group
consisting of ASV, HIV, HFV, Mo-MLV, VISNA and FIV.

17. The nucleic acid of claim 14, wherein said
retroelement is Ty-1.

18. The nucleic acid molecule of claim 16,
wherein said integrase-encoding segment is obtained from
an ASV genome and said DNA binding moiety-encoding
segment is obtained from a gene that encodes a LexA DNA
binding protein.

19. The nucleic acid molecule of claim 14,
wherein said integrase moiety-encoding segment and said
DNA binding moiety-encoding segment are operably linked
such that expression of said nucleic acid molecule
produces said chimeric enzyme in which said DNA binding
moiety is appended to said integrase moiety at one
terminus of said integrase moiety.

20. The nucleic acid molecule of claim 19,
wherein said one terminus of said integrase moiety is the
amino terminus.

21. The nucleic acid molecule of claim 19,
wherein said one terminus of said integrase moiety is the
carboxyl terminus.

22. The nucleic acid of claim 14, which upon
expression produces a chimeric enzyme in which said
integrase moiety is truncated from its amino terminus so
as to remove part or all of its ZF domain, and said DNA
binding moiety is appended to said integrase moiety at
said truncated amino terminus.



23. The nucleic acid molecule of claim 14,
wherein said DNA binding domain-encoding segment is
obtained from a gene encoding a DNA binding protein
selected from the group consisting of helix-turn-helix
proteins and b/hlh/z proteins.

24. A vector comprising the nucleic acid
molecule of claim 14.

25. The vector of claim 24, which is a
retroviral vector.

26. A replication-competent retroviral vector
comprising the nucleic acid molecule of claim 21.

27. The vector of claim 24, which is an
expression vector selected from the group consisting of a
procaryotic cellular expression vector, a eucaryotic
cellular expression vector and a cell-free expression
vector.

28. A chimeric enzyme produced by expression of
the nucleic acid molecule of claim 14.